1. Introduction

One  goal  of  software  engineering  is  to  find  efficient  tools  which  produce  from  a  program 
quantitative  life  cycle  information  which  has  a  well-understood  interpretation.  Program  mutation 
[14, 2, 13] is such a tool; its goal is to provide a measure of how well data T has tested the functional
correctness  of  program  P. 

In program mutation a set of alternative programs M1, M2, ..., Mm, called mutants, is constructed
from P and run against T. Ideally, each Mi is functionally different from P -- each mutant represents a
potential error in P. Assuming P runs acceptably on T, Mi failing on T indicates that P does not
contain the error represented by Mi. Thus, a quantitative measure of how well P has been tested by T,
modulo the represented potential errors, is given by the percentage of failing mutants.

Three experimental program mutation systems have been built: two for testing Fortran
programs [6, 5], called FMS.1 and FMS.2, and one for testing Cobol programs [2, 1], called CMS.1.
The goal of each system has been to conduct experiments aimed at providing interpretations of the
mutant failure percentage -- that is, determining exactly what types of common programming errors
can and cannot be detected via mutations and comparing the relative strength of program mutation
testing to the other contemporary testing methods such as symbolic execution [11, 10, 16] and
coverage measures [19, 18, 23]. The results of these experiments have been reported in [22, 2, 7, 1, 8].
On these systems, programs having lengths of up to 1000 statements have been tested
under experimental conditions. [15] represents the most comprehensive test by program mutation to
date.

All  three  mutation  systems  have  approached  the  problem  of  constructing  mutants  by  defining  a  set 
of  from  25  to  30  mutant  operators.  A  mutant  operator  is  a  simple  syntactic  or  semantic  program 
transformation  such  as  changing  a  particular  relational  operator  to  one  of  the  five  other  relational 
operators,  changing  the  semantics  of  a  particular  Fortran  DO  loop  to  act  as  an  Algol  FOR  loop,  or 
changing  a  particular  variable  reference  name  to  be  one  of  the  program's  other  named  variables  of 
compatible  type.  All  three  mutation  systems  have  implemented  only  first-order  mutations  which 
come from a single application of a mutant operator on the program P. Analysis has shown that the
number of mutants generated by these systems is on the order of N^2, where N is the number of
statements in P [22, 8].

The method of mutant operations was chosen because it is conceptually simple and easy to
implement. However, using general program transformations introduces a new problem -- the
generation of mutants which are functionally equivalent to the given program P. These equivalent
mutants are a nuisance since they add no power to the mutant test and thus must be diligently
accounted for during experiments. Furthermore, during some experiments as many as 10% of the
mutants have turned out to be equivalent. A method to automatically detect some types of equivalent
mutants, based on compiler optimization techniques, has been designed but it has yet to be
implemented [4].

The  experiments  run  on  these  three  prototype  program  mutation  systems  have  shown  that  the 
method  will  uncover  virtually  all  errors  detected  by  contemporary  testing  methods  and  have  further 
shown  that  program  mutation  has  the  potential  to  detect  some  errors  which  are  overlooked  by  all 
other  methods.  However,  due  to  the  relatively  large  number  of  mutants  generated  it  remains  to  be 
seen  whether  or  not  the  method  can  be  implemented  efficiently  enough  to  be  put  into  a  production 
environment. This is further aggravated by the above equivalent mutants problem.

2.  Potential  Speed-Up  Methods 

All current program mutation systems execute mutants in their entirety on test data. For test data
T consisting of n test cases I1, I2, ..., In, the program P is first executed to produce corresponding output
O1, O2, ..., On. In functional notation, Oi = P(Ii). A mutant Mj is said to fail if, for some test case Ii, either
Mj has a run-time exception or Mj(Ii) is different from P(Ii). The main loop of current mutation
systems is illustrated in figure 2-1. Details will be given in section 3.
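
The failure criterion and main loop described above can be sketched as follows. This is a minimal illustration in Python, not the code of FMS or CMS: programs and mutants are modelled as plain functions from one input to one output, and the names run_mutation_test, failure_percentage, p and mutants are hypothetical. The loop order (test cases outside, mutants inside) is an assumption.

```python
from typing import Callable, Dict, Sequence

Program = Callable[[int], int]  # assumption: a program maps one input to one output


def run_mutation_test(program: Program,
                      mutants: Dict[str, Program],
                      test_inputs: Sequence[int]) -> Dict[str, bool]:
    """Return, for each mutant, True if it failed on some test case."""
    # Execute P first to record the expected outputs O_i = P(I_i).
    expected = [program(x) for x in test_inputs]
    failed = {name: False for name in mutants}
    for x, o in zip(test_inputs, expected):
        for name, mutant in mutants.items():
            if failed[name]:
                continue  # a failed mutant is never executed again
            try:
                if mutant(x) != o:
                    failed[name] = True
            except Exception:
                failed[name] = True  # a run-time exception also counts as failure
    return failed


def failure_percentage(failed: Dict[str, bool]) -> float:
    """Percentage of mutants that failed on the test data."""
    return 100.0 * sum(failed.values()) / len(failed)


# Illustrative program (absolute value) and three hand-written mutants.
def p(x: int) -> int:
    return x if x >= 0 else -x


mutants = {
    "ge_to_gt": lambda x: x if x > 0 else -x,  # equivalent to p
    "drop_neg": lambda x: x,                   # wrong for negative inputs
    "div": lambda x: abs(x) * x // x,          # raises an exception when x == 0
}

result = run_mutation_test(p, mutants, [3, -2, 0])
```

Note that the equivalent mutant ge_to_gt is executed on every test case, while drop_neg is abandoned after its first failure.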

In  this  report  we  will  discuss  how  the  mutation  test  can  potentially  be  more  efficiently  implemented 
by  overviewing  the  design  of  a  new  prototype  program  mutation  system  which  radically  differs  from 
all  previous  systems.  Rather  than  tackle  a  new  language,  we  will  concentrate  on  performing  program 
mutation  on  Fortran  programs.  The  following  four  potential  speed-up  factors  will  be  incorporated. 

2.1.  Distributed  Computation 

Figure 2-1 illustrates the sequential nature of current program mutation systems. However, there is
nothing inherently sequential in the method. Indeed, the nature of mutation analysis in which several
mutants are independently run against test data suggests that the method is a "natural" for distributed
computing. For a distributed system with r processors, one can conceive of routing mutants and test
data to free processors with a potential speed-up factor of r. The method is particularly suited to local
area networks of personal machines [20].

The  detailed  design  of  a  prototype  distributed  program  mutation  system  should  focus  on  the 
communication  issues  related  to  realizing  as  close  as  possible  the  potential  r  speed-up  factor  afforded 
by distributed computing. An overview of the system design and an implementation strategy will be
given  in  sections  4  and  5. 

2.2.  Automatically  Detecting  Equivalent  Mutants 

Figure 2-1 illustrates the computational waste of equivalent mutants. Assume that Mj is not
equivalent to P and that Mj will fail on some Ii. As soon as Mj fails it is never again executed.
However, if Mj is equivalent to P then Mj will be executed on each Ii since Mj can never fail.3

In a production system virtually all equivalent mutants must be automatically detected for reasons
other than efficiency: if equivalent mutants are not accounted for, then the interpretation of the
mutant failure percentage can be misleading. Furthermore, it has been observed that manually
detecting equivalent mutants can require large amounts of human effort [2, 15], and manual
detection of equivalent mutants is an error-prone human activity which again can lead to
misinterpretation of the mutant failure percentage.
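
A small calculation shows how unaccounted equivalent mutants distort the failure percentage. The sketch below assumes that "accounting for" equivalent mutants means removing them from the denominator; the function name and the counts are invented for illustration.

```python
def mutant_failure_percentage(total: int, failed: int, equivalent: int = 0) -> float:
    """Percentage of failing mutants among those that are able to fail.

    Assumption: equivalent mutants are excluded from the denominator.
    """
    if not 0 <= equivalent <= total:
        raise ValueError("equivalent must lie between 0 and total")
    candidates = total - equivalent
    if candidates == 0:
        raise ValueError("no non-equivalent mutants to measure against")
    if not 0 <= failed <= candidates:
        raise ValueError("failed must lie between 0 and total - equivalent")
    return 100.0 * failed / candidates


# Made-up counts: 200 mutants, 20 of them equivalent, 180 failing.
raw = mutant_failure_percentage(200, 180)           # equivalents ignored
adjusted = mutant_failure_percentage(200, 180, 20)  # equivalents removed
```

With these counts the raw figure suggests that a tenth of the represented errors remain untested, while the adjusted figure shows that every mutant that could fail did fail.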

The design of our prototype distributed system will incorporate the detection method designed in
[4]. To experiment on how well the method works and to improve the method one would use a data
base of programs with known equivalent mutants as has been collected in [9, 7]. It has been
estimated that as many as 95% of the equivalent mutants can be detected automatically [1, 8]. Aside
from improving this figure, experiments could suggest strategies to help manually detect the
remaining small number of equivalent mutants.

2.3.  Partial  Mutant  Execution 

As  stated  above  and  illustrated  in  figure  2-1,  the  current  mutation  systems  execute  mutants  in  their 
entirety  on  test  data.  In  the  design  of  the  prototype  distributed  system  we  will  incorporate  an 
execution time saving feature of commencing mutant execution at the point of mutation.

The method can best be illustrated through a straight line program. Assume P is an N statement
straight line program with statements S1, S2, ..., SN. Let T be a test case for P with input I and output
O. Let D represent all data variables accessed by P and denote by Di the state of D after Si has been
executed. Figure 2-2 illustrates this notation. Then if mutant Mj affects statement Sk but doesn't
affect statements S1, S2, ..., Sk-1 we can, without loss of generality, begin the execution of Mj with data
state Dk-1 at mutated statement Sk.
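
The idea can be sketched for a straight line program as follows. This is an illustration under simplifying assumptions, not the prototype's design: each statement is modelled as a Python function from one data state to the next, every data state of P is stored in full, and the names assign, run_recording_states and run_mutant_from are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Statement = Callable[[State], State]  # maps data state D_{i-1} to D_i


def assign(target: str, expr: Callable[[State], int]) -> Statement:
    """Build a statement that assigns expr(state) to the variable target."""
    return lambda d: {**d, target: expr(d)}


def run_recording_states(statements: List[Statement], d0: State) -> List[State]:
    """Execute P once and return [D_0, D_1, ..., D_N]."""
    states = [dict(d0)]
    for s in statements:
        states.append(s(states[-1]))
    return states


def run_mutant_from(statements: List[Statement], k: int,
                    mutated_sk: Statement, states: List[State]) -> State:
    """Run the mutant that replaces S_k (1-based), starting from P's stored D_{k-1}."""
    d = mutated_sk(states[k - 1])
    for s in statements[k:]:  # S_{k+1} .. S_N are unchanged
        d = s(d)
    return d


# P:  S1: a = x + 1   S2: b = a * 2   S3: y = b - x
s1 = assign("a", lambda d: d["x"] + 1)
s2 = assign("b", lambda d: d["a"] * 2)
s3 = assign("y", lambda d: d["b"] - d["x"])
program = [s1, s2, s3]
m2 = assign("b", lambda d: d["a"] + 2)  # mutant of S2: '*' replaced by '+'

states = run_recording_states(program, {"x": 3})
partial = run_mutant_from(program, 2, m2, states)        # skips S1
full = run_recording_states([s1, m2, s3], {"x": 3})[-1]  # executes S1 again
```

Both runs of the mutant end in the same data state, but the partial run never executes S1.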

3 An equivalent Mj could fail due to a run-time exception. Experience indicates that this is extremely rare.

The major obstacle to be overcome in implementing this speed-up method lies in how to store and
retrieve P's intermediate data values. The straight-line program example illustrates some issues and
ideas to explore in designing the prototype distributed system. Clearly storing D0, ..., DN would work
but would require a factor of N additional storage. However, it is easy to see that if the values
computed by the S's (rather than the D's) are stored in a reversed linked-list fashion, then Dk can be
constructed without executing S1, S2, ..., Sk. This approach can be extended to tree and acyclic graph
programs, but breaks down for arbitrary program structures.
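
One way to read the storage idea above is sketched below. It assumes that every statement assigns a single variable, so that one (variable, value) pair is logged per statement; a Python list walked backwards stands in for the reversed linked list, and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Assignment = Tuple[str, Callable[[State], int]]  # (target variable, expression)


def record_assignments(statements: List[Assignment], d0: State) -> List[Tuple[str, int]]:
    """Execute P once, logging only the value computed by each statement."""
    d = dict(d0)
    log = []
    for target, expr in statements:
        value = expr(d)
        d[target] = value
        log.append((target, value))
    return log


def reconstruct_state(d0: State, log: List[Tuple[str, int]], k: int) -> State:
    """Rebuild D_k from the log without executing S_1 .. S_k."""
    state: State = {}
    # Walk backwards from S_k: the most recent write to each variable wins.
    for target, value in reversed(log[:k]):
        state.setdefault(target, value)
    # Variables never written by S_1 .. S_k keep their initial values.
    for var, value in d0.items():
        state.setdefault(var, value)
    return state


# P:  S1: a = x + 1   S2: a = a * 2   S3: y = a - x
program = [
    ("a", lambda d: d["x"] + 1),
    ("a", lambda d: d["a"] * 2),
    ("y", lambda d: d["a"] - d["x"]),
]
d0 = {"x": 3}
log = record_assignments(program, d0)
d2 = reconstruct_state(d0, log, 2)
```

The log holds N values rather than N complete data states, which is the storage saving the paragraph refers to.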

It has been shown that a large variety of program structures, usually restricted to intraprocedure
analysis, can be reduced to the above three forms [17]. In the prototype distributed program
mutation system we will explore using the above techniques at the node level of the reduced flow
graph representation of Fortran subroutines. We will also need to construct and use the procedure
call graph as defined in [17]. It might be necessary to commence mutant execution in the calling
routine for test cases which cause subroutines to be called many times.

There  is  another  saving  of  execution  time  that  can  be  realized  with  the  above  method  of 
commencing  mutant  execution  at  the  point  of  mutation.  Referring  again  to  figure  2-2,  assume 
mutant Mj affects statement Sk but doesn't affect any other statements.4 Then we not only can begin
execution of Mj at mutated statement Sk but we can, without loss of generality, terminate the
execution of Mj after executing mutated Sk provided the Dk data state of the mutant matches the Dk
data state of the program P, since in that case Mj cannot fail. The case where the data states are unequal will be
discussed  in  the  next  subsection. 

In  the  detailed  design  of  the  distributed  mutation  system,  one  should  instrument  measurement 
techniques  in  an  effort  to  estimate  how  much  of  a  saving  is  being  realized  due  to  partial  mutant 
execution. 

2.4.  A  "Weak"  Mutation  Option 

It  was  seen  above  that  mutants  can  be  terminated  at  the  point  of  mutation  provided  the  mutant's 
data  state  matches  the  program’s  at  that  point.  If  they  don’t  match,  however,  the  mutant  cannot  be 
terminated  and  marked  as  failing  since  at  some  later  point  in  its  execution  its  data  state  could  return 
to  match  the  program's. 

It would be unwise to continuously monitor a mutant's data state to see if it has returned to match
the program since we intuitively feel that such returns are rare.5 Marking a mutant as failing if it
doesn't match the program at the point of mutation will be called weak program mutation. Two
things are clear about weak program mutation: 1) it can be implemented much more efficiently than
program mutation, and 2) it cannot give more information on the correctness of a program than
program mutation. What is not clear is how much weaker weak program mutation is than program
mutation.
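
The difference between the two criteria can be illustrated on a straight line program. The sketch below is hypothetical throughout: statements are (variable, expression) pairs, the program's output is taken to be the final value of one named variable, and run-time exceptions are ignored. It also applies the early termination of section 2.3 when the data states match.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Assignment = Tuple[str, Callable[[State], int]]  # (target variable, expression)


def classify_mutant(statements: List[Assignment], k: int, mutated_sk: Assignment,
                    d0: State, output_var: str) -> Dict[str, bool]:
    """Report whether a mutant of S_k (1-based) fails under each criterion."""
    # Run P once, recording D_0 .. D_N.
    states = [dict(d0)]
    for target, expr in statements:
        d = dict(states[-1])
        d[target] = expr(d)
        states.append(d)

    # Execute only the mutated S_k, starting from P's stored D_{k-1}.
    target, expr = mutated_sk
    d = dict(states[k - 1])
    d[target] = expr(d)
    if d == states[k]:
        # Data states match: the mutant cannot fail on this test case.
        return {"weak": False, "strong": False}

    # Weak mutation stops here and marks the mutant as failing.
    # Program mutation must run S_{k+1} .. S_N and compare the outputs.
    for target, expr in statements[k:]:
        d[target] = expr(d)
    strong = d[output_var] != states[-1][output_var]
    return {"weak": True, "strong": strong}


# P:  S1: a = x   S2: y = a * a
program = [("a", lambda d: d["x"]), ("y", lambda d: d["a"] * d["a"])]
d0 = {"x": 3}
negated = classify_mutant(program, 1, ("a", lambda d: -d["x"]), d0, "y")
same = classify_mutant(program, 1, ("a", lambda d: abs(d["x"])), d0, "y")
shifted = classify_mutant(program, 1, ("a", lambda d: d["x"] + 1), d0, "y")
```

The negated mutant is a case where the data state returns to match the program: weak mutation marks it as failing on this test case, while program mutation does not.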

A prototype distributed program mutation system should have weak program mutation as an
option. This would allow one to conduct experiments on the above question. The experiments would
be of a "beat the system" nature [7] in which a subject takes programs with known errors and tries to
develop test data on which the program doesn't fail but on which all mutants of the program fail.
Initially, the same programs which have been used in beat the system experiments on program
mutation [9] should be used for beat the system experiments on weak program mutation. This will
allow a comparison of program mutation and weak program mutation on known results. Later, the
improved efficiency of the prototype distributed mutation system would allow one to conduct
experiments on programs of much larger size.

3.  Design  Overview  of  Current  Mutation  Systems 

The three existing program mutation systems all have the same basic design [6, 5]. There are six
major  modules: 

1. A Parser -- the program to be tested is parsed into an internal form which is suitable for
program mutation.

2. An Interpreter -- executes internal form representations of programs and mutants.
Detects various run-time failure exceptions.

3. A Test Case Manager -- controls execution of the program on the test data and records
the output of the program.

4. A Mutant Generator -- applies the mutant operators to the program to generate mutant
descriptors which indicate what changes to the internal form constitute a mutation.

5. A Mutant Manager -- uses the mutant descriptors to create mutants by altering the
internal form, controls execution of the mutants on the test data, and maintains tables
which indicate the failure status of mutants.

6. A Report Generator -- creates a printable summary of the testing run.

5 Intuitively, if a mutant will return to match the program then it will do so quickly -- within the next few statements. This fits ...

From the descriptions of these modules it can be seen that there are four major data structures in
current mutation systems:

1. The internal form representation of the program.

2.  The  test  cases  and  the  program's  output  on  them. 

3.  The  mutant  descriptors,  and 

4.  The  mutant  status  tables. 

A  testing  run  on  current  mutation  systems  can  be  broken  down  into  three  relatively  independent 
phases: 

1. Phase 1 -- The program is parsed, the program is executed on the test data, and the
mutant descriptors are generated.

2. Phase 2 -- The mutants are executed on the test data.

3. Phase 3 -- The report is generated.

Figure  3-1  summarizes  the  design  and  how  testing  runs  are  done  on  the  current  program  mutation 
systems in terms of the components described above. Note that in testing a program P, the systems
allow the test data to be augmented without redoing what has been previously done in phase 1.
Furthermore, the user has the option of applying the mutant operators incrementally from run to run
rather than dealing with all mutants from the outset. There are three places indicated in figure 3-1
where a testing run may start:

(A)  Start  point  for  the  first  testing  run. 

(B)  Start  point  for  subsequent  runs  involving  new  test  data, 
and  possibly  the  application  of  more  mutant  operators. 

(C)  Start  point  for  subsequent  runs  involving  no  new 
test  data  but  applying  more  mutant  operators. 

4.  Design  Overview  of  the  Prototype  Distributed  Mutation  System 

The  prototype  distributed  mutation  system  will  still  have  the  three  phases  of  current  mutation 
systems.  Implementing  the  ideas  of  section  2  will  require  some  new  components  as  well  as  major 
design  overhauls  to  some  existing  components.  There  will  be  two  new  components: 

1. A Program Flow Graph -- this data structure will indicate where and how partial mutant
execution can be done. It will also indicate the basic blocks of the program.

2.  An  Equivalence  Tester  --  this  module  will  use  the  basic  block  information  of  the 
program  flow  graph  to  mark  mutant  descriptors  as  equivalent. 

Five components of existing mutation systems need substantial revisions. They are:

1. The Parser -- the program flow graph will be generated by the parser.

2. The Internal Form -- the internal form representation of the program will now include
the program flow graph.

3. The Test Case Manager -- in order to do partial mutant execution it will be necessary for
the test case manager to record data state information on the program at the points
indicated by the flow graph.

4. The Test Cases -- in addition to the input/output information, the test cases will now
contain the intermediate information necessary for partial mutant execution.

5. The Mutant Manager -- the mutant manager will now control partial execution of
mutants, both in starting mutant execution at the point of mutation and in ending
mutant execution immediately thereafter in the case of weak program mutation. In
addition, the mutant manager will control the parallel execution of mutants. The
distributed aspects of the mutant manager will be elaborated below.

The phase 1 design overview of the prototype distributed system is illustrated in figure 4-1. Note
that we could have "parallelized" phase 1 to take advantage of a distributed system. We choose to
avoid the complexities of doing so because the execution time spent in phase 1 is insignificant with
respect to the execution time spent in phase 2.

To  describe  the  distributed  aspects  of  the  prototype  distributed  system,  we  will  use  the  standard 
parallel  processing  abstraction  terms  of  process,  message  passing,  father  process,  and  son  process,  in 
order to describe the design structure independently of any particular system architecture. An
implementation strategy will be described in the next section.

The mutant manager will be a father process capable of creating, managing, and communicating
with an arbitrary number of identical son processes called mutant executers. The mutant manager
will exist for the duration of phase 2 but the mutant executers may come and go. The mutant
executers operate independently of each other, don't communicate with each other, and don't know
or care about the existence of other mutant executers. The only communications are between the
mutant manager and the mutant executers. Figure 4-2 illustrates this process structure.

Upon creation, a mutant executer will contain code for communicating with the mutant manager,
the internal form representation of the program, and the interpreter. These components remain
resident for the entire existence of a mutant executer. After creation, the mutant manager passes to
the mutant executer one test case which will be resident in the executer for quite some time. This is
done for two reasons: the test case can constitute much data and we wish to minimize
communications, and we want the mutant executer to "self-optimize" itself for executing a particular
test case and this will be a time consuming activity.

After these two steps the mutant executer is ready to create and execute mutants. For simplicity,
this will be done one mutant at a time -- the mutant manager will pass a mutant descriptor to the
mutant executer and then wait for a message indicating whether or not the mutant has failed. Note
that all communication between the manager and the executer will be of a
command-acknowledgement nature. In addition, the manager can also expect messages from the
executer such as "I am about to cease to exist".
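
The father/son structure and the command-acknowledgement protocol can be sketched as below. This is only an illustration under stated assumptions: Python threads and queues stand in for processes and message passing, a dictionary of functions stands in for the internal form plus interpreter, a mutant's name stands in for its descriptor, and the message tags (TEST_CASE, MUTANT, STOP, ACK, RESULT, EXITING) are invented here.

```python
import queue
import threading
from typing import Callable, Dict, Sequence

Program = Callable[[int], int]


def mutant_executer(mutants: Dict[str, Program],
                    commands: queue.Queue, replies: queue.Queue) -> None:
    """Son process stand-in: holds one resident test case and obeys commands."""
    test_input = expected = None
    while True:
        msg = commands.get()
        if msg[0] == "TEST_CASE":      # the test case stays resident
            _, test_input, expected = msg
            replies.put(("ACK",))
        elif msg[0] == "MUTANT":       # execute one mutant, report its status
            name = msg[1]
            try:
                failed = mutants[name](test_input) != expected
            except Exception:
                failed = True          # run-time exception counts as failure
            replies.put(("RESULT", name, failed))
        elif msg[0] == "STOP":
            replies.put(("EXITING",))  # "I am about to cease to exist"
            return


def mutant_manager(program: Program, mutants: Dict[str, Program],
                   test_inputs: Sequence[int]) -> Dict[str, bool]:
    """Father process stand-in: one executer per test case."""
    executers = []
    for x in test_inputs:
        commands, replies = queue.Queue(), queue.Queue()
        worker = threading.Thread(target=mutant_executer,
                                  args=(mutants, commands, replies))
        worker.start()
        commands.put(("TEST_CASE", x, program(x)))  # input and P's output
        if replies.get() != ("ACK",):
            raise RuntimeError("executer did not acknowledge its test case")
        executers.append((worker, commands, replies))

    failed = {name: False for name in mutants}
    for name in mutants:
        for _, commands, _ in executers:  # send the command to every executer
            commands.put(("MUTANT", name))
        for _, _, replies in executers:   # then wait for every acknowledgement
            _, _, result = replies.get()
            failed[name] = failed[name] or result

    for worker, commands, replies in executers:
        commands.put(("STOP",))
        replies.get()
        worker.join()
    return failed


def p(x: int) -> int:
    return x if x >= 0 else -x


mutants = {
    "ge_to_gt": lambda x: x if x > 0 else -x,
    "drop_neg": lambda x: x,
    "div": lambda x: abs(x) * x // x,
}
status = mutant_manager(p, mutants, [3, -2, 0])
```

In this sketch the executers never talk to each other, and every message from the manager is answered by exactly one reply, which keeps the manager's bookkeeping simple.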

5.  An  Implementation  Strategy 

The  implementation  of  the  prototype  distributed  mutation  system  can  be  done  in  three  successive 
stages, each building on the former with the first stage building on the current FMS.2 program
mutation  system.  The  completion  of  stage  one  would  permit  the  performance  of  the  experiments  on 
weak  program  mutation  which  were  outlined  in  section  2. 

5.1. Stage 1

During this stage FMS.2 would be modified to implement the equivalence tester and partial mutant
execution. The necessary changes were summarized in section 4. There will be no introduction of
parallelism during stage 1.

5.2.  Stage  2 

During this stage the mutant manager, the mutant executer, and the communication mechanism
between them would be built and the system would evolve toward being of a distributed nature.
Depending on the available facilities, it might not be necessary to actually use or simulate distributed
hardware during this stage. For example, a DEC-2060 running the TOPS-20 operating system [12] is
particularly suited for building the type of distributed system which we have described since it
supports a tree-structured hierarchy of asynchronous processes, and interprocess message passing.
Thus in stage 2 it is recommended that the prototype distributed mutation system be built on a single
processor machine under an operating system which supports in software a realistic version of
distributed computation.

5.3.  Stage  3 

During  this  stage  any  artificiality  in  the  distributed  mutation  system  can  be  removed  by  converting 
it  to  run  on  a  multiprocessor  system.  There  are  several  such  systems  currently  available.  For 
example,  at  Yale  there  are  two  DEC-2060’s  connected  via  an  Ethernet-like  [20]  local  area  network 
called  Chaosnet  [21].  Building  the  prototype  mutation  system  on  this  network  would  allow  an 
examination  of  the  communication  issues  involved  in  running  mutant  executers  on  different 
processors. 

Unfortunately,  for  the  above  network  of  two  DEC-2060’s,  the  mutant  manager  would  be  on  the 
same machine as one of the mutant executers. The artificiality that this imposes will be minimal since
the mutant manager will be dormant most of the time. However, even this small degree of artificiality
can  be  avoided  if  one  has  available  a  local  area  network  consisting  of  many  powerful  personal 
computers  such  as  the  recently  announced  Apollo  machine  [3]. 